1.
biorxiv; 2023.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2023.04.19.537514

ABSTRACT

The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, which have become part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and has facilitated vaccine development as well as drug reuse studies and drug design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets at scale via the SARS-CoV-2 Data Hubs, and (4) lessons learned. As a component of the Platform, the SARS-CoV-2 Data Hubs enabled the extension and setup of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.


Subject(s)
COVID-19
2.
medrxiv; 2022.
Preprint in English | medRxiv | ID: ppzbmed-10.1101.2022.02.07.22270243

ABSTRACT

Background: Outbreak strains are good candidates in which to look for intrinsic transmissibility, as they are responsible for a large number of cases with sustained transmission. However, assessment of the success of long-lived outbreak strains has been flawed by the use of low-resolution typing methods and geographically restricted investigations. We now have the potential to address the nature of outbreak strains by combining large genomic datasets and phylodynamic approaches.
Methods: We retrospectively sequenced the whole genomes of representative samples assigned to an outbreak that has circulated in the Canary Islands (GC) since 1993 and accounts for ∼20% of local TB cases. We selected a panel of specific SNP markers to search in silico for additional outbreak-related sequences within publicly available TB genomic data. Using this information, we inferred the origin, spread and epidemiological parameters of the GC-outbreak.
Findings: Our approach allowed us to accurately trace both the historical and recent dispersion of the strain. We documented its high success within the Canarian archipelago but found limited expansion abroad. Estimation of epidemiological parameters from genomic data argues against a distinct biology of the GC-strain.
Interpretation: With the increasing availability of genomic data allowing for accurate inference of strain spread and key epidemiological parameters, we can now revisit the link between Mycobacterium tuberculosis genotypes and transmission, as is routinely done for SARS-CoV-2 variants of concern. We show that the success of the GC-strain is better explained by social determinants than by intrinsically higher bacterial transmissibility. Our approach can be used to trace and characterize strains of interest worldwide.
Funding: European Research Council (101001038-TB-RECONNECT), the Ministerio de Economía, Industria y Competitividad (PID2019-104477RB-I00), Instituto de Salud Carlos III (FIS18/0336), European Commission – NextGenerationEU (Regulation EU 2020/2094), through CSIC’s Global Health Platform (PTI Salud Global) to IC; Gobierno de Aragón/Fondo Social Europeo “Construyendo Europa desde Aragón” to SS.
Research in context
Evidence before this study: Identification of intrinsically highly transmissible strains of Mycobacterium tuberculosis remains elusive. Among the candidates are strains that have been thriving in a community for decades and represent a significant contribution to the long-term local TB burden. These long-lived outbreak strains have been identified in different parts of the world, and the speculation is that their success is linked to higher transmissibility. Several studies have attempted to analyze the epidemiological characteristics of these strains, as well as their genomic composition, to look for potential transmission determinants. However, those studies are usually circumscribed to the strains' original geographic boundaries. By contrast, intrinsically higher transmissibility should be reproducible in different parts of the world, a lesson learnt from SARS-CoV-2 variants of concern. Previous attempts failed to examine the success of these outbreak strains at a global scale. Thus, it is unknown whether the long-lived outbreak strains had a similar or different trajectory in other countries, which casts doubt on their transmissibility potential.
Added value of this study: Here we analyzed a strain causing a long-lived outbreak in the Canary Islands since 1993 using whole genome sequencing. As in previous studies of similar outbreak strains, we analyzed the diversity and phylodynamics of the outbreak in the area where it was originally described.
However, thanks to the possibility of interrogating the entire European Nucleotide Archive, we had the unique chance to look at the spread of the strain beyond its original geographic boundaries. This approach allowed us to comprehensively trace the real spatio-temporal spread of the outbreak, from the emergence of its ancestor about 700 years ago to its recent transmission outside the Canary Islands. We found limited evidence of similar success of the strain outside the Canary Islands. Furthermore, we complemented the analysis with epidemiological data on the early cases and with phylodynamic analysis to estimate key epidemiological parameters linked to the strain's spread. All evidence strongly suggests that factors related to the host, rather than the bacteria, are behind the persistence and expansion of the outbreak strain.
Implications of all the available evidence: Infectious disease outbreaks are a major problem for public health. Tracing outbreak expansion and knowing the main factors behind outbreak emergence and persistence are key to effective disease control. Our study allows researchers and public health authorities to use WGS-based methods to trace outbreaks, and to include available epidemiological information to evaluate the factors underpinning outbreak persistence. Taking advantage of all the information freely available in public repositories, researchers can accurately establish the expansion of an outbreak beyond its original boundaries, and they can determine the potential risk of the strain to inform health authorities, which in turn can define targeted strategies to mitigate its expansion and persistence. Finally, we show the need to evaluate strain transmissibility in different geographic contexts in order to unequivocally associate its spread with local or pathogen factors, a major lesson taken from SARS-CoV-2 genomic surveillance.
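
The marker-based search described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the marker positions and alleles, the sample IDs, and the function names (`matches_outbreak`, `screen_samples`) are all hypothetical, and it assumes that variant calls for each public sample are already available as a mapping from genome position to allele.

```python
from typing import Dict, List

# Hypothetical marker panel: 1-based genome position -> outbreak-specific allele.
# These positions and alleles are invented for illustration only.
MARKER_PANEL: Dict[int, str] = {1200: "T", 45210: "G", 310077: "A"}


def matches_outbreak(sample_calls: Dict[int, str],
                     panel: Dict[int, str],
                     min_fraction: float = 1.0) -> bool:
    """Return True if the sample carries at least min_fraction of the panel alleles."""
    if not panel:
        raise ValueError("marker panel must not be empty")
    hits = sum(1 for pos, allele in panel.items()
               if sample_calls.get(pos) == allele)
    return hits / len(panel) >= min_fraction


def screen_samples(samples: Dict[str, Dict[int, str]],
                   panel: Dict[int, str],
                   min_fraction: float = 1.0) -> List[str]:
    """Return the sorted IDs of samples whose calls match the marker panel."""
    return sorted(sample_id for sample_id, calls in samples.items()
                  if matches_outbreak(calls, panel, min_fraction))


# Toy variant calls for three made-up samples.
samples = {
    "sample_A": {1200: "T", 45210: "G", 310077: "A"},  # carries all three markers
    "sample_B": {1200: "T", 45210: "C", 310077: "A"},  # carries two of three
    "sample_C": {1200: "C"},                           # carries none
}

candidates = screen_samples(samples, MARKER_PANEL)
print(candidates)
```

Requiring every marker (`min_fraction=1.0`) is the strictest choice; lowering the threshold tolerates missing calls at the cost of more false positives.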


Subject(s)
Tuberculosis
3.
biorxiv; 2022.
Preprint in English | bioRxiv | ID: ppzbmed-10.1101.2022.01.21.477194

ABSTRACT

Viral sequence data from clinical samples frequently contain human contamination, which must be removed prior to sharing for legal and ethical reasons. To enable host read removal for SARS-CoV-2 sequencing data on low-specification laptops, we developed ReadItAndKeep, a fast, lightweight tool for Illumina and nanopore data that keeps only reads matching the SARS-CoV-2 genome. Peak RAM usage is typically below 10 MB, and runtime is less than one minute. We show that by excluding the polyA tail from the viral reference, ReadItAndKeep prevents bleed-through of human reads, whereas mapping to the human genome lets some reads escape. We believe our test approach (including all possible reads from the human genome, human samples from each of the 26 populations in the 1000 Genomes data, and a diverse set of SARS-CoV-2 genomes) will also be useful for others. ReadItAndKeep is implemented in C++, released under the MIT license, and available from https://github.com/GenomePathogenAnalysisService/read-it-and-keep.
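
The effect of excluding the polyA tail from the reference can be seen in a toy Python sketch. This is not ReadItAndKeep's implementation: shared k-mer matching is used here only as a simple stand-in for read-to-reference matching, reverse-complement reads are ignored, and the sequences, function names and thresholds (`min_run`, `k`, `min_shared`) are assumptions made for illustration.

```python
from typing import Iterable, List, Set


def strip_polya_tail(reference: str, min_run: int = 10) -> str:
    """Drop a trailing run of A's from the reference if it is at least min_run long."""
    trimmed = reference.rstrip("Aa")
    if len(reference) - len(trimmed) >= min_run:
        return trimmed
    return reference


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def keep_matching_reads(reads: Iterable[str], reference: str,
                        k: int = 8, min_shared: float = 0.5) -> List[str]:
    """Keep reads for which at least min_shared of their k-mers occur in the reference."""
    ref_kmers = kmers(reference, k)
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, so it cannot be assessed
        if len(read_kmers & ref_kmers) / len(read_kmers) >= min_shared:
            kept.append(read)
    return kept


# Made-up reference: a short "genome" followed by a 20-base polyA tail.
reference = "ACGTTGCAGGCTAAGCTTGACCGTATCGGA" + "A" * 20
ref_no_tail = strip_polya_tail(reference)

viral_read = "TGCAGGCTAAGCTTGACC"  # substring of the made-up genome
polya_read = "A" * 18              # stands in for a polyA-rich human read
reads = [viral_read, polya_read]

kept_with_tail = keep_matching_reads(reads, reference)
kept_without_tail = keep_matching_reads(reads, ref_no_tail)
print(len(kept_with_tail), len(kept_without_tail))
```

With the tail left in, the polyA-rich read matches the reference and is kept; once the tail is stripped, only the read drawn from the genome body survives.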


Subject(s)
Blindness